N-layer Approach to Web Information Retrieval
نویسنده
چکیده
In web information retrieval, the terms or keywords are used for indexing purpose of document. These terms or keywords appear in special location such as title, subtitle, header, hyperlinks and so on. Vector space model ignores the importance of these terms with respect to their position while calculating the weight of the indexing terms. The effectiveness of the vector space model crucially depends on the weights applied to the terms of the document vectors. These weights are found using a term weight evaluation scheme based on the frequency of the terms in the document and the collection. Terms that occur more often in a document are treated as more important whereas terms that occur less frequently throughout a collection are given a higher weight. In N-level Vector space approach, the importance of these terms with respect to their position is considered. The web document is logically divided in N-layer considering the structure of web document and weights are assigned to terms based on their presence in different layer within the document. Different weight evaluation schemes proposed for vector space models are applied to N-level vector space model and are compared. N-layer vector space model gives better result as compare to vector space model. Cosine similarity and all six weight evaluation methods that are formed using different local weights and global weights show that average precision and average recall in case of N-layer vector space model is always better than vector space model.
منابع مشابه
Behavioral Considerations in Developing Web Information Systems: User-centered Design Agenda
The current paper explores designing a web information retrieval system regarding the searching behavior of users in real and everyday life. Designing an information system that is closely linked to human behavior is equally important for providers and the end users. From an Information Science point of view, four approaches in designing information retrieval systems were identified as system-...
متن کاملبازیابی اطلاعات تصویری حوزهی سلامت در وب از دیدگاه متخصصان علوم پزشکی:یک مطالعه کیفی
Introduction: The medical image as a source of non-textual information has an important role in the field of medicine. Since the quality of life is directly related to health, employing this type of information is effective in improving the practice of health professionals. This study was aimed to survey medical image retrieval in the Web from the perspective of experts in medical sciences. M...
متن کاملAssessing the Internal Structure of the Ellis Information Retrieval Model in Order to Present the Persian Norm of Web Retrieval Tools
Introduction: Study evaluated the internal structure of Ellis information seeking model in the student community with the aim of presenting the Persian norm. Methods: This is a descriptive-analytical study conducted by cross-sectional survey method in the second semester of the academic year 1399-1400. Population comprise of 280 graduate students at Ahvaz Jundishapur University of Medical Scien...
متن کاملA Layered Approach to NLP-Based Information Retrieval
A layered approach to information retrieval permits the inclusion of multiple search engines as well as nmltiple databases, with a natural language layer to convert English queries for use by the various search engines. The NI,P layer incorporates morphological analysis, noun phrase syntax, and semantic expansion based on WordNet.
متن کاملPerformance Analysis of Layered Vector Space Model in Web Information Retrieval
Information on the web is growing exponentially. The unprecedented growth of available information coupled with the vast number of available online activities. It has introduced a new wrinkle to the problem of web search. It is difficult to retrieve relevant information. In this context search engines have become a valuable tool for users to retrieve relevant information. Finding relevant infor...
متن کامل